S-SEER: Selective Perception in a Multimodal Office Activity Recognition System

Authors

Nuria Oliver

Eric Horvitz

Abstract

The computation required for sensing and processing perceptual information can impose significant burdens on personal computer systems. We explore several policies for selective perception in SEER, a multimodal system for recognizing office activity that relies on a cascade of Hidden Markov Models (HMMs) called Layered Hidden Markov Models (LHMMs). We use LHMMs to diagnose states of a user’s activity based on real-time streams of evidence from video, audio, and computer (keyboard and mouse) interactions. We review our efforts to employ expected value of information (EVI) to limit sensing and analysis in a context-sensitive manner. We discuss an implementation of a greedy EVI analysis and compare its results with those of a heuristic sensing policy that makes observations at different frequencies. Both policies are then compared to a random perception policy, in which sensors are selected at random. Finally, we discuss the sensitivity of ideal perceptual actions to preferences encoded in utility models about information value and the cost of sensing.
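To make the idea of a greedy EVI analysis concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single discrete belief over activity states, one discrete observation per sensor, and a utility table indexed by (action, state); the actual system reasons over LHMMs and real-time feature streams. All names (`expected_utility`, `evi`, `greedy_select`) and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def expected_utility(belief, utility):
    """Expected utility of the best action under a belief over states.

    utility[a, s] is the (assumed) payoff of taking action a in state s.
    """
    return float(np.max(utility @ belief))


def evi(prior, likelihood, utility):
    """Expected value of information for one sensor.

    likelihood[o, s] = P(observation o | state s); an assumed discrete
    sensor model, not one taken from the paper.
    """
    base = expected_utility(prior, utility)
    joint = likelihood * prior        # P(o, s), shape (n_obs, n_states)
    p_obs = joint.sum(axis=1)         # P(o)
    value = 0.0
    for o, p_o in enumerate(p_obs):
        if p_o > 0:
            posterior = joint[o] / p_o          # P(s | o) by Bayes' rule
            value += p_o * expected_utility(posterior, utility)
    return value - base


def greedy_select(prior, sensors, costs, utility):
    """Pick the single sensor with the largest net value (EVI minus cost).

    Returns (None, 0.0) when no sensor is worth its cost.
    """
    best_name, best_net = None, 0.0
    for name, likelihood in sensors.items():
        net = evi(prior, likelihood, utility) - costs[name]
        if net > best_net:
            best_name, best_net = name, net
    return best_name, best_net


# Hypothetical two-state example: utility 1 for a correct diagnosis, 0 otherwise.
prior = np.array([0.5, 0.5])
utility = np.eye(2)
sensors = {
    "video": np.array([[0.9, 0.1], [0.1, 0.9]]),   # informative sensor
    "audio": np.array([[0.6, 0.4], [0.4, 0.6]]),   # weakly informative sensor
}
costs = {"video": 0.35, "audio": 0.02}             # assumed sensing costs
choice, net_value = greedy_select(prior, sensors, costs, utility)
```

In this toy setting the more informative sensor is not necessarily the one selected: once the assumed cost of video is high enough, the cheaper audio sensor has the larger net value, which illustrates the sensitivity to sensing cost that the abstract mentions.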


Similar resources

A Comparison of HMMs and Dynamic Bayesian Networks for Recognizing Office Activities

We present a comparative analysis of a layered architecture of Hidden Markov Models (HMMs) and dynamic Bayesian networks (DBNs) for identifying human activities from multimodal sensor information. We use the two representations to diagnose users’ activities in S-SEER, a multimodal system for recognizing office activity from real-time streams of evidence from video, audio and computer (keyboard an...


Proposal for a Deep Learning Architecture for Activity Recognition

Activity recognition from computer vision plays an important role in research towards applications like human computer interfaces, intelligent environments, surveillance or medical systems. In this paper, we propose a gesture recognition system based on a deep learning architecture and show how it performs when trained with changing multimodal input data on an Italian sign language dataset. The...


Multimodal signaling in fowl, Gallus gallus.

Many social birds produce food-associated calls. In galliforms, these vocalizations are typically accompanied by a distinctive visual display, creating a multimodal signal known as tidbitting. This system is ideal for experimental analysis of the way in which signal components interact to determine overall efficacy. We used high-definition video playback to explore perception of male tidbitting...


Selective deficits in human audition: evidence from lesion studies

The human auditory cortex is the gateway to the most powerful and complex communication systems and yet relatively little is known about its functional organization as compared to the visual system. Several lines of evidence, predominantly from recent studies, indicate that sound recognition and sound localization are processed in two at least partially independent networks. Evidence from human...
